Method of joint decoding of possibly mutilated code words



The invention relates to a method of decoding possibly mutilated code words 
of a code, wherein an information word and an address word are encoded into a code word of 
said code using a generator matrix and wherein said address words are selected such that 
address words having a known relationship are assigned to consecutive code words. The 
invention relates further to a corresponding apparatus for decoding possibly mutilated code
words and to a computer program for implementing said method. 

In European patent application 01201841.2 (PHNL 10331), the concept of
coding for informed decoding is described. An enhancement can be found in European patent
application 01 203 147.2 (PH-NL 010600). The key element of the inventions described in 
said European patent applications is an appropriate selection of the mapping of information 
strings towards code words from the Error Correcting Code (ECC) that is applied. The aim of 
coding for informed decoders is to enable more reliable information retrieval if the decoder a
priori knows part of the encoded information. A typical example is in the field of address
retrieval of optical media. In case of a forced jump to a certain sector, part of the address of 
the sector in which the read/write head will land is known taking the jump accuracy into 
account. For instance in DVR (Digital Video Recording) it is anticipated that on the spiral 
track on the optical disc, a series of messages is encoded which aids in the logical positioning
of the read/write head above the disc, which e.g. contains copyright information, date
information and a logical track counter. Of this information, only the logical track counter
changes in between successive messages. Therefore, a priori known information can be 
available from successful decoding of previous messages. 

According to the solutions described in the above mentioned European patent
applications, decoding of a second message can only gain from the decoding of the first
message if said first decoding is successful. A second decoding can then in turn aid in the
decoding of the third message and so on. However, if the decoding of the first message fails,
the fact that the second message has been encoded in a special way does not improve the
error correcting capabilities, i.e. the second and any further decoding is not helped by
previous decodings. All decodings could then fail. In other situations, for example at the very
start of a recording or playback session, not much is known about the actual landing place of 
the read/write head. In that case, only the error correcting capabilities of the code can be used 
for retrieving information. 


It is therefore an object of the present invention to provide an improved 
method and apparatus of decoding possibly mutilated code words which can be used if the 
error correcting capabilities of the code are insufficient to allow reliable information retrieval 
and which can also be used if there are so many errors that, despite application of informed
decoding as described in the above mentioned European patent applications, these errors
cannot be corrected.

This object is achieved according to the present invention by a method of 
decoding as claimed in claim 1, comprising the steps of: 
- decoding the differences of a number of pairs of possibly mutilated code
words to obtain estimates for the differences of the corresponding pairs of code words,
- combining said estimates to obtain a number of at least two corrupted versions
of a particular code word,
- forming a code vector from said number of corrupted versions of said
particular code word in each coordinate,
- decoding said code vector to a decoded code word in said code, and
- using said generator matrix to obtain the information word and the address
word embedded in said decoded code word.

A corresponding apparatus for decoding according to the present invention is 
claimed in claim 11. A computer program comprising computer program code means for
causing a computer to perform the steps of the method as claimed in claim 1 when that 
computer program is run on said computer is claimed in claim 12. Preferred embodiments of 
the invention are defined in the dependent claims. 

The present invention is based on the idea to employ certain relationships 
between consecutive code words and to jointly decode several such consecutive code words.
"Consecutive" code words in this context shall mean code words which are read and/or 
inputted into the decoder subsequently, e.g. code words which are located in series next to 
each other in a data stream or which are stored in consecutive sectors on an information 
carrier such as a CD, DVD or DVR or magnetic disc.

The proposed method of joint decoding comprises two main elements. A first
main element comprises the step of obtaining estimates for the differences of various pairs of
consecutive code words by decoding the differences of their corrupted versions. According to
a second main element, the decoding results of said differences of corrupted pairs of
consecutive code words are combined, resulting in a number of corrupted versions of one and
the same code word. These corrupted versions are then all used for obtaining the desired code
word, from which finally the information word and the address word can be retrieved. Since
the address words encoded in the code words have a certain known relationship, much can be
said during decoding about the difference of possibly mutilated code words, and this knowledge
is advantageously used according to the invention during decoding.

Preferred embodiments of the step of forming the code vector are defined in
claims 2 to 4. Preferably, said code vector is formed by majority voting in each coordinate
from the number of corrupted versions obtained from the estimates. If more than one value
occurs most frequently among said number of corrupted versions, the corresponding coordinate
of said code vector is erased according to a further preferred embodiment. Alternatively or in
addition, reliability information available on the symbols of one or more possibly mutilated
code words is used for selecting the coordinates of said code vector according to still another
preferred embodiment. Reliability information could be information about the probability
that a certain value is correct. If available, it is preferred to include reliability information for
each of the bits of the code vector for enhancing its decoding.

A preferred way of obtaining an estimate for the difference of a pair of code 
words is to decode the difference of the corresponding pair of possibly mutilated code words 
to the closest code word from a subcode consisting of all possible differences of two 
consecutive code words of the main code; this closest code word is then used as said
estimate.

The probability of incorrect decoding can be further reduced by introducing an
extra check according to which the obtained estimates are checked as to whether they show a
predetermined form and/or have a possible value. Preferably, if this check fails, the decoding
result should be rejected.

The present invention is advantageously used for decoding of code words
stored on an information carrier, such as an optical or magnetic disc. According to standards
used for optical recording, such as the CD-DA or the DVR standard, address words assigned
to consecutive code words are consecutive, e.g. are subsequently increased by one, and
preferably represent the sector address of the sector in which the corresponding code word is
stored. Said relationship between the address words assigned to consecutive code words will
be exploited by the present invention.

For reducing decoding delay, it is advantageous if the number of pairs of
possibly mutilated code words to be decoded is as small as possible. However, at least two
pairs of two consecutive possibly mutilated code words have to be decoded so that at least
three corrupted versions can be obtained, since majority voting on two alternatives is not
useful. If reliability information is used instead of majority voting, one pair might be
sufficient.

According to another aspect of the present invention it is proposed that in said
step of combining said estimates to obtain a number of corrupted versions of a particular
code word a first corrupted version corresponds to a first possibly mutilated code word, a
second corrupted version corresponds to the difference between a second possibly mutilated
code word and a first estimate, obtained by decoding the difference between said first and
said second possibly mutilated code words, and a third corrupted version corresponds to the
difference between a third possibly mutilated code word, said first estimate and a second
estimate, obtained by decoding the difference between said second and said third possibly
mutilated code words.

According to still another aspect of the invention the proposed solution could
also be used in combination with the solutions described in the above mentioned European
patent applications using a priori known information during decoding. A preferred way could
be that, in a first step, the decoder tries to decode a possibly mutilated code word using a
priori known information available on the address word embedded in said possibly mutilated
code word. If this decoding fails or if the result is not reliable enough, then the present
solution of joint decoding could be used. However, it is also possible that the present solution
is always used in addition to the use of a priori known information during decoding.



The invention will now be explained in more detail with reference to the
drawings, in which

Fig. 1 shows the format of a data word to be encoded,
Fig. 2 shows the format of a code word,
Fig. 3 shows a block diagram of an encoding and decoding scheme, and
Fig. 4 shows a block diagram of the method of decoding according to the
present invention.

In the following, it is assumed that information is represented as a k-bit string
called data word d as shown in Fig. 1, said data word d comprising an address word a(i) and
an information word m. The address word a(i) is a b-bit string representing the sector
address for sector i; the information word m is a (k-b)-bit string containing any information
to be stored, such as audio, video, software, copyright or date information or any other kind
of data. It should be noted that the address word a(i) is known if i is known and vice versa;
however, knowledge of i does not give information on the information word m.

The data word d shown in Fig. 1 is encoded using a k × n binary generator
matrix G so that the data word d = (a(i), m) is mapped onto the n-bit code word
c(i, m) = (a(i), m)G as shown in Fig. 2.
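
By way of illustration only, the mapping of Fig. 2 can be sketched as follows. The sketch assumes that a(i) is the b-bit binary representation of i and uses a small 4 x 7 generator matrix chosen purely as an example; the names encode, address_word and G_EXAMPLE are hypothetical and are not taken from the described embodiment.

```python
def address_word(i, b):
    """b-bit binary representation of i, most significant bit first (assumed format)."""
    return [(i >> (b - 1 - t)) & 1 for t in range(b)]


def encode(address_bits, message_bits, G):
    """Map the data word d = (a(i), m) to the code word c = d G over GF(2)."""
    d = list(address_bits) + list(message_bits)
    if len(d) != len(G):
        raise ValueError("data word length must equal the number of rows of G")
    c = [0] * len(G[0])
    # Over GF(2), d G is the XOR of the rows of G selected by the ones in d.
    for bit, row in zip(d, G):
        if bit:
            c = [x ^ y for x, y in zip(c, row)]
    return c


# Example generator matrix (assumption): k = 4, n = 7, systematic form.
G_EXAMPLE = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

# Sector i = 2 with b = 2 address bits and the 2-bit information word m = (1, 1).
code_word = encode(address_word(2, 2), [1, 1], G_EXAMPLE)
```

Because the encoding is linear, the sum modulo 2 of two code words is the code word of the sum of the two data words, which is the property used by the joint decoding described below.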

Fig. 3 shows a block diagram of a typical system using encoding and 
decoding. Therein user data, e.g. audio or video data, coming from a data source 1, e.g.
recorded on a master tape or master disc, are encoded before they are stored on a data carrier,
e.g. a disc, or transmitted over a transmission channel, e.g. over the internet, before they are 
again decoded for forwarding them to a data sink 9, e.g. for replaying them. 

The user data of the source 1 are first encoded by a source encoder 2, then
error correction encoded by an ECC encoder 3 and thereafter modulated by a modulator 4,
e.g. an EFM modulator, before the encoded user data - the code words - are put on the
channel 5 on which errors may be introduced into the code words. The term "channel" 5 shall
here be interpreted broadly, including a transmission channel as well as storage of the encoded
data on a data carrier for a later replay.

When replay of data is intended, the encoded data first have to be demodulated
by a demodulator 6, e.g. an EFM demodulator, before they are error correction decoded by an
ECC decoder 7 and source decoded by a source decoder 8. Finally, the decoded user data can 
be input to the sink 9, e.g. a player device for replay of the user data. 

The method of decoding according to the present invention shall be explained
in more detail with reference to Fig. 4. It shall be assumed that data stored in encoded form
on the record carrier 10 shall be replayed. In a first step, an amount of data r is read by a
reading unit 11 and forwarded to a decoding apparatus 12. On their way from the encoder
to the decoder, errors might be introduced into the code words, e.g. by scratches on an optical
record carrier or by transmission errors, so that the read code words r are possibly mutilated.
Those errors shall be corrected by the decoder 12.

At first, the difference D of code words situated in consecutive sectors (or
located consecutively in a transmitted data stream) is computed in unit 13 from the read
words $r_i$ and $r_{i+1}$ of the sectors i and i+1. For the underlying code words it holds,
by linearity of the encoding, that

  $c(i, m_1) \oplus c(i+1, m_2) = (A(i), m_1 \oplus m_2)G$,

wherein $A(i) = a(i) \oplus a(i+1)$ for $0 \le i \le 2^b - 2$. The key observation is that for appropriate choices of
the address word a, much can be said of the difference A(i) of two consecutive address words.
It should be noted that $\oplus$ indicates a modulo-2 operation in which additions and subtractions
have the same result.

Assuming that corrupted versions $r_1$, $r_2$, $r_3$ of L = 3 consecutive code words $c_1$,
$c_2$, $c_3$ are read, one can write for j = 1, 2, 3:

  $r_j = c(i+j-1, m_j) \oplus e_j = (a(i+j-1), m_j)G \oplus e_j$,

$e_j$ representing an error vector. It is clear that

  $D_{12} = r_1 \oplus r_2 = (a(i), m_1)G \oplus (a(i+1), m_2)G \oplus (e_1 \oplus e_2) = (A(i), m_1 \oplus m_2)G \oplus (e_1 \oplus e_2)$ and
  $D_{23} = r_2 \oplus r_3 = (A(i+1), m_2 \oplus m_3)G \oplus (e_2 \oplus e_3)$.

These L-1 = 2 differences $D_{12}$ and $D_{23}$ are computed by unit 13 and input into the first
decoding unit 14.

In the decoding unit 14 said differences $D_{12}$, $D_{23}$ are each decoded to the
closest code word from a subcode C' which consists of all possible differences of two
consecutive code words c of the main code C. This could e.g. be done by comparing the
differences $D_{12}$, $D_{23}$ with all possible code words of the subcode C' and by selecting the
closest code word as estimate u for $c(i, m_1) \oplus c(i+1, m_2)$ and as estimate v for
$c(i+1, m_2) \oplus c(i+2, m_3)$, respectively. Thus, in the first decoding unit 14 estimates u, v for the
differences $c_1 \oplus c_2$ and $c_2 \oplus c_3$ of the pairs of code words, corresponding to the differences
$r_1 \oplus r_2$ and $r_2 \oplus r_3$ of the pairs of possibly mutilated code words, are obtained.
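
The exhaustive comparison performed in decoding unit 14 might, for instance, look like the following sketch. It assumes that the subcode of difference code words is small enough to be listed explicitly and that closeness is measured by Hamming distance; the function names and the two-word example subcode are hypothetical.

```python
def hamming_distance(x, y):
    """Number of positions in which the bit lists x and y differ."""
    return sum(p != q for p, q in zip(x, y))


def decode_to_nearest(word, subcode):
    """Return the word of `subcode` closest to `word` in Hamming distance.

    If several words are equally close, the first one in the list is returned;
    a practical decoder might rather declare a decoding failure in that case.
    """
    return min(subcode, key=lambda c: hamming_distance(word, c))


# Hypothetical subcode of difference code words (two words of length 7).
SUBCODE_EXAMPLE = [
    [1, 1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
]

# Difference of two read words in which a single bit error remains.
D12 = [1, 1, 0, 0, 1, 0, 1]
u = decode_to_nearest(D12, SUBCODE_EXAMPLE)
```

Here D12 differs from the second subcode word in one position and from the first in two positions, so the second word is selected as the estimate u.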
These estimates u, v are combined in unit 15 by computing

  $w_1 := r_1 = c(i, m_1) \oplus e_1$
  $w_2 := r_2 \oplus u = r_1 \oplus (r_1 \oplus r_2 \oplus u)$
  $w_3 := r_3 \oplus u \oplus v = r_1 \oplus (r_1 \oplus r_2 \oplus u) \oplus (r_2 \oplus r_3 \oplus v)$.

If the estimate u is correct, then $r_1 \oplus r_2 \oplus u = e_1 \oplus e_2$. Similarly, if v is correct, then
$r_2 \oplus r_3 \oplus v = e_2 \oplus e_3$. Hence, if the estimates u and v both are correct, then

  $w_1 = c(i, m_1) \oplus e_1$
  $w_2 = c(i, m_1) \oplus e_2$
  $w_3 = c(i, m_1) \oplus e_3$.

In combining unit 15 a number L = 3 of corrupted versions $w_1$, $w_2$, $w_3$ of the particular code
word $c_1 = c(i, m_1)$ are thus obtained.
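
The combining step of unit 15 can be sketched for an arbitrary number of read words as follows. This is a minimal illustration that takes the estimates as given and assumes they are correct; the function names and the example vectors are hypothetical and the example words are not claimed to belong to any particular code.

```python
def xor(x, y):
    """Component-wise addition modulo 2 of two bit lists."""
    return [p ^ q for p, q in zip(x, y)]


def combine(received, estimates):
    """Turn read words r_1..r_L and difference estimates into w_1..w_L.

    estimates[j] is the estimate for the difference of code words j+1 and j+2
    (counting from 1), so w_1 = r_1, w_2 = r_2 + u, w_3 = r_3 + u + v, and so on.
    """
    versions = [list(received[0])]
    accumulated = [0] * len(received[0])
    for r, estimate in zip(received[1:], estimates):
        accumulated = xor(accumulated, estimate)
        versions.append(xor(r, accumulated))
    return versions


# Example with L = 3: three words and one bit error in each read word.
c1 = [0, 0, 1, 0, 0, 1, 1]
c2 = [0, 1, 1, 0, 1, 1, 0]
c3 = [1, 0, 1, 0, 1, 0, 1]
e1 = [1, 0, 0, 0, 0, 0, 0]
e2 = [0, 0, 0, 1, 0, 0, 0]
e3 = [0, 0, 0, 0, 0, 0, 1]
received = [xor(c1, e1), xor(c2, e2), xor(c3, e3)]
correct_estimates = [xor(c1, c2), xor(c2, c3)]  # assumed to be decoded correctly
w1, w2, w3 = combine(received, correct_estimates)
```

With correct estimates, each of w1, w2 and w3 equals c1 disturbed by the error vector of the corresponding read word only.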

Next, in unit 16 the code vector z is constructed by component-wise
majority voting of the corrupted versions $w_1$, $w_2$, $w_3$ of the code word $c_1$. That is, for each
$i \in \{1, 2, \ldots, n\}$, the i-th component $z_i$ of the code vector z is an erasure if $w_{1i}$, $w_{2i}$ and $w_{3i}$ are
distinct; otherwise, the component $z_i$ equals the most frequent element among $w_{1i}$, $w_{2i}$, $w_{3i}$.
The code vector z is then decoded by a second decoding unit 17 for the code C that decodes it
into a code word c' of said code C. Finally, in unit 18 the generator matrix G, which had been
used by the encoder to encode the address words and the information words into code words,
is used to retrieve the information word m and the address word a embedded in said
code word c'.

In general, a number of corrupted versions of L consecutive code words are
read, say $r_j = c(i+j-1, m_j) \oplus e_j$ for j = 1, 2, ..., L. Estimates for the differences of each of the
(L-1) pairs of consecutive code words are obtained. By combining these estimates, L
corrupted versions $w_1, w_2, \ldots, w_L$ of the code word $c_1 = c(i, m_1)$ are obtained. If all the
estimates are correct, then it holds that $w_j = c(i, m_1) \oplus e_j$ for j = 1, 2, ..., L. The code vector z is
obtained as the majority vote of $w_1, \ldots, w_L$ in each of the coordinates. If in a certain
coordinate more than one symbol occurs most frequently, this coordinate in the code vector z is
erased. Finally, the code vector z is decoded to a code word in the code C.
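
The coordinate-wise majority vote with erasures can be illustrated by the following sketch. Representing an erasure by None is an assumption made for this example only, as are the function name and the example vectors.

```python
from collections import Counter

ERASURE = None  # marker for an erased coordinate (assumed representation)


def majority_vote(versions):
    """Form the code vector z from the corrupted versions w_1..w_L.

    A coordinate is erased if more than one symbol occurs most frequently.
    """
    z = []
    for column in zip(*versions):
        ranked = Counter(column).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            z.append(ERASURE)
        else:
            z.append(ranked[0][0])
    return z


# L = 3 versions of the same word, each with one bit error in a different position.
z_three = majority_vote([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
])

# L = 2 versions: every disagreement leads to an erasure.
z_two = majority_vote([
    [1, 0],
    [1, 1],
])
```

The second call shows why L = 2 is of little use with plain majority voting: wherever the two versions disagree, no decision can be taken.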

For reducing decoding delay, it is advantageous if the number L is as small as
possible. With L = 2, the described method is not appropriate, as majority voting on two
alternatives is not useful. If reliability information, also known as soft decision information,
is available on the bits of the possibly mutilated code words $r_1$ and $r_2$, reliability information
for each of the bits of $w_2 := r_2 \oplus u$ can be obtained according to a well-known method. The
code vector z can now, instead of by majority voting, be obtained by setting the
coordinates $z_i$ to the most reliable of the bits $r_{1i}$ and $w_{2i}$. For enhancing the decoding of
the code vector z, reliability information could be included for each of the bits of the code
vector z. The reliability information for bit i in z is obtained by combining the reliability
information of the bits $r_{1i}$ and $w_{2i}$.
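
A possible reading of this reliability-based selection for L = 2 is sketched below. The text leaves open how reliabilities are represented and combined; the sketch assumes non-negative integer reliability values supplied by the caller and, purely as an assumption, combines them by adding them when the two bits agree and taking their absolute difference when they disagree. All names are hypothetical.

```python
def select_by_reliability(r1, rel_r1, w2, rel_w2):
    """Choose each coordinate of z as the more reliable of r1[i] and w2[i].

    Returns the code vector z and a combined reliability per coordinate.
    The combination rule (sum on agreement, absolute difference on
    disagreement) is an assumption of this sketch, not taken from the text.
    """
    z, rel_z = [], []
    for bit1, q1, bit2, q2 in zip(r1, rel_r1, w2, rel_w2):
        # On equal reliability the bit of r1 is kept (arbitrary tie-break).
        z.append(bit1 if q1 >= q2 else bit2)
        rel_z.append(q1 + q2 if bit1 == bit2 else abs(q1 - q2))
    return z, rel_z


z, rel_z = select_by_reliability(
    r1=[1, 0, 1], rel_r1=[9, 2, 5],
    w2=[1, 1, 0], rel_w2=[4, 7, 5],
)
```

A coordinate with combined reliability zero carries no usable information and could be treated as an erasure by the second decoding unit.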

Next, two special cases shall be briefly discussed. In a first special case, it is
considered that the address word a(i) is the conventional b-bit binary representation of
the integer i. For example, if b = 8, then a(57) = 00111001, as

  $57 = 0 \cdot 2^7 + 0 \cdot 2^6 + 1 \cdot 2^5 + 1 \cdot 2^4 + 1 \cdot 2^3 + 0 \cdot 2^2 + 0 \cdot 2^1 + 1 \cdot 2^0$.

The binary representations of two consecutive integers nearly always have the same leftmost
bit; the only exception is the address 011...1 that has 10...0 as successor. Therefore it can
be assumed, with only a very small probability of being wrong, that the leftmost bit of the
difference $A(i) = a(i) \oplus a(i+1)$ of two consecutive address words equals zero. More generally,
it will be shown that it is very likely that A(i)
starts with many zeros. In other words, a solution as proposed in the above mentioned
European patent applications EP 01 201 841.2 and EP . . . of using a priori known information
in the decoder can be applied, since it is known that many of the leftmost information bits of
A(i) are (with a high probability) equal to zero.

Let i be an integer between 0 and $2^b - 2$. Let j be the number of ones in which
a(i) ends, so that $0 \le j \le b-1$. One can write $a(i) = s01^j$, where s has length b-j-1 and $1^j$
denotes a string of j ones. It can be easily gathered that $a(i+1) = s10^j$, and

  $A(i) = a(i+1) \oplus a(i) = 0^{b-j-1}1^{j+1}$.

The following conclusions can be drawn:

a) for each $i \in \{0, 1, \ldots, 2^b - 2\}$, A(i) is of the form $0^{b-m}1^m$ for some $m \in \{1, 2, \ldots, b\}$.
b) $A(i) = 0^{b-m}1^m$ if and only if a(i) ends in exactly (m-1) ones.

From conclusion b) it follows that the number of integers i for which A(i)
starts with b-m zeros and ends in m ones equals $2^{b-m}$. Stated differently, for $m \ge 1$, the fraction
of integers $i \in \{0, 1, \ldots, 2^b - 2\}$ for which A(i) ends in exactly m ones equals $2^{b-m}/(2^b - 1) \approx (1/2)^m$.
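
Conclusions a) and b) and the resulting counts can be checked by brute force for a small address length, as in the following sketch. Addresses are handled as Python integers, which is an implementation choice of this example; the function names are hypothetical.

```python
def address_difference(i):
    """A(i) = a(i) + a(i+1) modulo 2, with addresses held as integers."""
    return i ^ (i + 1)


def trailing_ones(x):
    """Number of ones in which the binary representation of x ends."""
    count = 0
    while x & 1:
        x >>= 1
        count += 1
    return count


def count_by_run_length(b):
    """For each m, count the i in {0, ..., 2^b - 2} with A(i) = 0^(b-m) 1^m."""
    counts = {}
    for i in range(2 ** b - 1):
        diff = address_difference(i)
        m = diff.bit_length()
        # Conclusion a): A(i) consists of m ones preceded by zeros only.
        if diff != (1 << m) - 1:
            raise AssertionError("A(i) is not of the form 0^(b-m) 1^m")
        # Conclusion b): a(i) then ends in exactly m - 1 ones.
        if trailing_ones(i) != m - 1:
            raise AssertionError("a(i) does not end in m - 1 ones")
        counts[m] = counts.get(m, 0) + 1
    return counts


counts = count_by_run_length(8)
fraction_at_most_4 = sum(counts[m] for m in range(1, 5)) / (2 ** 8 - 1)
```

For b = 8 the exact fraction of addresses for which A(i) ends in at most 4 ones is 240/255, slightly above the approximation 15/16.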

For example, the fraction of integers i for which A(i) ends in at most 4 ones
approximately equals 1/2+1/4+1/8+1/16 = 15/16 = 0.9375. That is, if it is assumed that A(i)
starts with b-4 zeros, one is correct in nearly 94% of the cases. If it is instead assumed that A(i)
starts with b-8 zeros, the minimum Hamming distance obtained when applying the idea of using
a priori known information symbols in the decoder drops, but the assumption is correct with a
much larger probability of 255/256 ≈ 0.9961.

After decoding a corrupted version of $(A(i), m_1 \oplus m_2)G$, i.e. after decoding the
difference $D_{12}$, it should be checked whether the purported value for A(i) is of the form $0^{b-m}1^m$ for
some $m \ge 1$. If not, the decoding result should be rejected. This extra check greatly reduces the
probability of incorrectly decoding, e.g., copyright information.
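
The extra check on the purported value of A(i) could be implemented along the following lines. The sketch assumes that the address part of the decoded difference is available as a list of b bits, leftmost bit first; the function name is hypothetical.

```python
def is_valid_address_difference(bits):
    """True if bits is a (possibly empty) run of zeros followed by at least one 1."""
    if not bits or bits[-1] != 1:
        return False  # m >= 1 requires the rightmost bit to be one
    seen_one = False
    for bit in bits:
        if bit == 1:
            seen_one = True
        elif seen_one:
            return False  # a zero after a one breaks the required form
    return True


accepted = is_valid_address_difference([0, 0, 0, 1, 1])
rejected = is_valid_address_difference([0, 1, 0, 1, 1])
```

A decoding result for which the check returns False would be rejected rather than passed on to the combining step.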

According to another special case all sectors have the same information word m. The
difference between the code words for the sectors i and i+1 can be computed as follows:

  $c(i, m) \oplus c(i+1, m) = (A(i), 0)G$.

In other words, the difference of any two consecutive code words is in the code $C_A$, defined
as

  $C_A = \{(A(i), 0)G \mid 0 \le i \le 2^b - 2\}$.



The difference of corrupted versions of two consecutive code words can therefore be decoded
to the code $C_A$. The code $C_A$, which is a subcode of the main code C, has small cardinality if
the set of difference vectors $\{A(i) \mid 0 \le i \le 2^b - 2\}$ has small cardinality, and in that case, its
minimum distance may well exceed the minimum distance of the code C.

In the special case that the address word a(i) is the conventional binary
representation of i, it holds that

  $C_A = \{(0^i 1^{b-i}, 0^{k-b})G \mid 0 \le i \le b-1\}$.

Consequently, the subcode $C_A$ contains only b words, not $2^b - 1$ words, and optimal
decoding can easily be performed by comparing the difference of two consecutive possibly
mutilated words r with all b words from the subcode $C_A$. It should be noted that the number
of comparisons to be made is linear, not exponential, in b.
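
For the case of binary addresses and identical information words, building the b words of the subcode and decoding a difference by b comparisons might be sketched as follows. The 4 x 7 generator matrix, the choice b = 2 and all names are assumptions made for this example only.

```python
def xor(x, y):
    """Component-wise addition modulo 2 of two bit lists."""
    return [p ^ q for p, q in zip(x, y)]


def encode(d, G):
    """Code word d G over GF(2): XOR of the rows of G selected by the ones in d."""
    c = [0] * len(G[0])
    for bit, row in zip(d, G):
        if bit:
            c = xor(c, row)
    return c


def build_difference_subcode(b, k, G):
    """All words (0^i 1^(b-i), 0^(k-b)) G for 0 <= i <= b-1."""
    return [encode([0] * i + [1] * (b - i) + [0] * (k - b), G) for i in range(b)]


def decode_difference(r1, r2, subcode):
    """Decode the difference of two read words to the closest subcode word."""
    D = xor(r1, r2)
    return min(subcode, key=lambda c: sum(p != q for p, q in zip(D, c)))


# Example generator matrix (assumption): k = 4, n = 7, with b = 2 address bits.
G_EXAMPLE = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
C_A = build_difference_subcode(2, 4, G_EXAMPLE)

# Sectors 0 and 1 carry the same information word m = (1, 0).
r1 = [1, 0, 1, 0, 0, 1, 1]  # code word of sector 0 with its first bit in error
r2 = [0, 1, 1, 0, 1, 1, 0]  # code word of sector 1, read without error
estimate = decode_difference(r1, r2, C_A)
```

In this example the two subcode words differ in three positions, so the single bit error in the difference is corrected and the estimate equals (0, 1, 0, 0)G.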

Since A(i) does usually not end in many ones, it might be considered to decode
to an even smaller subcode, namely

  $C'_A = \{(0^i 1^{b-i}, 0^{k-b})G \mid b' \le i \le b-1\}$,

where b' is an integer between 0 and b-1. The larger b', the smaller $C'_A$, but also the smaller
the likelihood of correct decoding.

Also in case that the address representation a corresponds to binary Gray
encoding, the subcode $C_A$ only has b elements. This is because, by definition, binary Gray
encoding means that two consecutive addresses only differ in one position, that is, for each i,
A(i) consists of one 1 and b-1 zeros.
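
The statement on Gray encoding can be verified numerically for a small b, for instance as follows. The sketch assumes the standard reflected binary Gray code, in which the address of i is i XOR (i >> 1); the function names are hypothetical.

```python
def gray(i):
    """Reflected binary Gray code of the integer i (assumed address mapping)."""
    return i ^ (i >> 1)


def gray_differences(b):
    """Set of all differences A(i) of consecutive b-bit Gray-coded addresses."""
    return {gray(i) ^ gray(i + 1) for i in range(2 ** b - 1)}


differences = gray_differences(4)
single_one = all(bin(d).count("1") == 1 for d in differences)
```

For b = 4 exactly four distinct differences occur, each containing a single one, in line with a subcode of b elements.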

The present invention constitutes an effective and reliable method for 
retrieving information stored in code words situated in several consecutive sectors or 
transmitted subsequently in a data stream. It employs certain relationships between 
consecutive code words and jointly decodes several such consecutive code words. The 
present solution can be applied in any encoding and decoding system where address words
having a known relationship are assigned to consecutive code words. 



